In The Claims: 



1. (Currently Amended) A system for cataloguing electronic information, 
comprising: 

an electronic device that captures audio/video data corresponding to a
photographic target, said audio/video data including a narration
concurrently provided by a narrator specifically to identify where
respective subject matter locations are positioned in said
audio/video data;

a speech recognition engine that automatically performs a speech
recognition process upon said narration to generate labels that
correspond to said respective subject matter locations in said
audio/video data, said labels being text conversions of utterances in
said narration, said labels being created for locating each being
specifically aligned with corresponding ones of said respective
subject matter locations within said audio/video data; and

a label manager that manages a label mode for generating and storing said 
labels, said label manager also controlling a label search mode for 
utilizing said labels to locate said respective subject matter locations 
in said audio/video data.

2. (Original) The system of claim 1 wherein said electronic device is implemented 
as an audio/video camcorder device.

3. (Original) The system of claim 1 wherein said speech recognition engine is
configured in a simplified configuration that efficiently compares said narration with 
acoustic models to identify phone strings that represent said narration, said speech 
recognition engine referencing a compact dictionary to look up recognized vocabulary 
words that correspond to said phone strings, said speech recognition engine utilizing 
a limited set of recognition grammar to form said recognized vocabulary words into 
said labels that are supported by said speech recognition engine. 

4. (Original) The system of claim 1 wherein said label manager initially instructs 
said electronic device to enter a real-time label mode for creating and storing said 
labels, said electronic device concurrently capturing said audio/video data and said
narration after said label manager instructs said electronic device to enter said real- 
time label mode. 

5. (Original) The system of claim 1 wherein said electronic device enters a real- 
time label mode in response to a verbal label-mode command from a system user, 
said verbal label-mode command being recognized and provided to said label 
manager by said speech recognition engine. 

6. (Original) The system of claim 1 wherein said speech recognition engine 
automatically generates said labels as said electronic device captures said 
audio/video data and said narration. 

7. (Original) The system of claim 1 wherein a post processor performs a post- 
processing procedure upon said labels in a real-time label mode, said post- 
processing procedure including a validation procedure using one or more confidence 
measures to eliminate invalid labels that fail to satisfy pre-determined validation 
criteria.

8. (Original) The system of claim 1 wherein said label manager stores said
labels during a real-time label mode, said labels being stored along with meta- 
information that associates each of said respective subject matter locations to a 
corresponding one of said labels. 

9. (Original) The system of claim 1 wherein said electronic device initially 
captures said audio/video data and said narration prior to entering said label
mode. 

10. (Original) The system of claim 1 wherein said label manager instructs said 
electronic device to enter a non-real-time label mode for creating and storing said 
labels, said electronic device responsively retrieving and playing back said 
audio/video data and said narration.

11. (Original) The system of claim 1 wherein said speech recognition engine
automatically generates said labels by analyzing said audio/video data and said
narration as said electronic device plays back said audio/video data and said
narration. 

12. (Original) The system of claim 1 wherein a post processor performs a post- 
processing procedure upon said labels in a non-real-time label mode, said post- 
processing procedure including a validation procedure using one or more confidence 
measures to eliminate invalid labels that fail to satisfy pre-determined validation 
criteria. 

13. (Original) The system of claim 1 wherein said label manager coordinates a 
label validation procedure for validating said labels, said label manager generating a 
validation graphical user interface upon a display of said electronic device for a 
system user to interactively evaluate, delete, and edit said labels. 



14. (Original) The system of claim 1 wherein said label manager coordinates a 
label validation procedure for validating said labels in response to verbal validation 
commands from a system user, said verbal validation commands being recognized 
and provided to said label manager by said speech recognition engine. 

15. (Original) The system of claim 1 wherein said label manager stores said labels 
in a non-real-time label mode, said labels being stored along with meta-information 
that associates each of said respective subject matter locations to a corresponding 
one of said labels. 

16. (Original) The system of claim 1 wherein said label manager instructs said 
electronic device to enter said label search mode during which a system user 
interactively selects a search label for performing a label search procedure to locate a 
specific one of said respective subject matter locations corresponding to said search 
label. 

17. (Original) The system of claim 1 wherein said label manager generates a label- 
search GUI on a display of said electronic device, a system user viewing said labels 
and corresponding representative images from said audio/video data for selecting a
search label. 

18. (Original) The system of claim 1 wherein a system user selects a search label 
by issuing a verbal search-label command, said verbal search-label command being 
recognized and provided to said label manager by said speech recognition engine. 

19. (Original) The system of claim 1 wherein said label manager instructs said 
electronic device to automatically locate and retrieve a specific one of said respective 
subject matter locations in response to a system user selecting a search label.

20. (Original) The system of claim 1 wherein said electronic device automatically
plays back a specific retrieved one of said respective subject matter locations from 
said audio/video data for viewing by said system user.

21. (Currently Amended) A method for cataloguing electronic information, 
comprising: 

capturing audio/video data corresponding to a photographic target by
utilizing an electronic device, said audio/video data including a
narration concurrently provided by a narrator specifically to identify
where respective subject matter locations are positioned in said
audio/video data;

providing a speech recognition engine that automatically performs a speech 
recognition process upon said narration to generate text labels that 
correspond to said respective subject matter locations in said 
audio/video data, said text labels being text conversions of 
utterances in said narration, said labels being created for locating
each being specifically aligned with corresponding ones of said
respective subject matter locations within said audio/video data;

managing a label mode for generating and storing said text labels by 
utilizing a label manager; and 

controlling a label search mode with said label manager, said label search 
mode utilizing said text labels to locate said respective subject matter 
locations in said audio/video data.

22. (Original) The method of claim 21 wherein said electronic device is 
implemented as an audio/video camcorder device.

23. (Original) The method of claim 21 wherein said speech recognition engine is
configured in a simplified configuration that efficiently compares said narration with 
acoustic models to identify phone strings that represent said narration, said speech 
recognition engine referencing a compact dictionary to look up recognized vocabulary 
words that correspond to said phone strings, said speech recognition engine utilizing 
a limited set of recognition grammar to form said recognized vocabulary words into 
said text labels that are supported by said speech recognition engine. 

24. (Original) The method of claim 21 wherein said label manager initially 
instructs said electronic device to enter a real-time label mode for creating and 
storing said text labels, said electronic device concurrently capturing said 
audio/video data and said narration after said label manager instructs said 
electronic device to enter said real-time label mode. 

25. (Original) The method of claim 21 wherein said electronic device enters a real- 
time label mode in response to a verbal label-mode command from a system user, 
said verbal label-mode command being recognized and provided to said label 
manager by said speech recognition engine. 

26. (Original) The method of claim 21 wherein said speech recognition engine 
automatically generates said text labels as said electronic device captures said 
audio/video data and said narration.

27. (Original) The method of claim 21 wherein a post processor performs a post- 
processing procedure upon said text labels in a real-time label mode, said post- 
processing procedure including a validation procedure using one or more confidence 
measures to eliminate invalid text labels that fail to satisfy pre-determined validation 
criteria.

28. (Original) The method of claim 21 wherein said label manager stores said
text labels during a real-time label mode, said text labels being stored along with 
meta-information that associates each of said respective subject matter locations 
to a corresponding one of said text labels. 

29. (Original) The method of claim 21 wherein said electronic device initially 
captures said audio/video data and said narration prior to entering said label 
mode. 

30. (Previously Presented) The method of claim 21 wherein said label manager 
instructs said electronic device to enter a non-real-time label mode for creating and 
storing said text labels, said electronic device responsively retrieving and playing 
back said audio/video data and said narration. 

31. (Original) The method of claim 21 wherein said speech recognition engine
automatically generates said text labels by analyzing said audio/video data and said
narration as said electronic device plays back said audio/video data and said
narration. 

32. (Original) The method of claim 21 wherein a post processor performs a post- 
processing procedure upon said text labels in a non-real-time label mode, said post- 
processing procedure including a validation procedure using one or more confidence 
measures to eliminate invalid text labels that fail to satisfy pre-determined validation 
criteria. 

33. (Original) The method of claim 21 wherein said label manager coordinates a 
label validation procedure for validating said text labels, said label manager 
generating a validation graphical user interface upon a display of said electronic 
device for a system user to interactively evaluate, delete, and edit said text labels.

34. (Original) The method of claim 21 wherein said label manager coordinates a
label validation procedure for validating said text labels in response to verbal 
validation commands from a system user, said verbal validation commands being 
recognized and provided to said label manager by said speech recognition engine. 

35. (Original) The method of claim 21 wherein said label manager stores said text
labels in a non-real-time label mode, said text labels being stored along with meta- 
information that associates each of said respective subject matter locations to a 
corresponding one of said text labels. 

36. (Original) The method of claim 21 wherein said label manager instructs said
electronic device to enter said label search mode during which a system user 
interactively selects a search label for performing a label search procedure to locate a 
specific one of said respective subject matter locations corresponding to said search 
label. 

37. (Original) The method of claim 21 wherein said label manager generates a
label-search GUI on a display of said electronic device, a system user viewing said
text labels and corresponding representative images from said audio/video data for
selecting a search label. 

38. (Original) The method of claim 21 wherein a system user selects a search label
by issuing a verbal search-label command, said verbal search-label command being 
recognized and provided to said label manager by said speech recognition engine. 

39. (Original) The method of claim 21 wherein said label manager instructs said 
electronic device to automatically locate and retrieve a specific one of said respective 
subject matter locations in response to a system user selecting a search label.

40. (Original) The method of claim 21 wherein said electronic device automatically
plays back a specific retrieved one of said respective subject matter locations from 
said audio/video data for viewing by said system user. 

41. (Currently Amended) A computer-readable medium comprising program 
instructions for cataloguing electronic information by: 

capturing audio/video data corresponding to a photographic target by
utilizing an electronic device, said audio/video data including a
narration concurrently provided by a narrator specifically to identify
where respective subject matter locations are positioned in said
audio/video data;

providing a speech recognition engine that automatically performs a speech 
recognition process upon said narration to generate text labels that 
correspond to said respective subject matter locations in said 
audio/video data, said text labels being text conversions of 
utterances in said narration, said labels being created for locating 
each being specifically aligned with corresponding ones of said 
respective subject matter locations within said audio/video data;

managing a label mode for generating and storing said text labels by 
utilizing a label manager; and 

controlling a label search mode with said label manager, said label search 
mode utilizing said text labels to locate said respective subject matter 
locations in said audio/video data.

42. (Currently Amended) A system for cataloguing electronic information,
comprising: 

means for capturing audio/video data corresponding to a photographic 
target, said audio/video data including a narration concurrently
provided by a narrator specifically to identify where respective
subject matter locations are positioned in said audio/video data;

means for automatically performing a speech recognition process upon 
said narration to generate text labels that correspond to said 
respective subject matter locations in said audio/video data, said
text labels being text conversions of utterances in said narration, 
said labels being created for locating each being specifically aligned 
with corresponding ones of said respective subject matter locations 
within said audio/video data;

means for managing a label mode to generate and store said text labels; 
and 

means for controlling a label search mode that utilizes said text labels to 
locate said respective subject matter locations in said audio/video
data.

43. (Currently Amended) A system for cataloguing electronic information,
comprising: 

an imaging device that captures audio/video data corresponding to
selected photographic targets, said audio/video data including a
verbal narration concurrently provided by a narrator specifically to
identify where respective subject matter locations are positioned in
said audio/video data;

a speech recognition engine that automatically performs a speech
recognition process upon said narration to generate text labels that
are based upon said narration, said text labels corresponding to said
respective subject matter locations in said audio/video data, said
text labels being text conversions of utterances in said narration,
said labels being created for locating each being specifically aligned
with corresponding ones of said respective subject matter locations
within said audio/video data, said text labels including abbreviated
word sequences that identify said selected photographic targets; and 

a label manager that manages a label mode during which said text labels 
are generated by said speech recognition engine, said label manager 
also storing said text labels during said label mode, said text labels 
being stored along with meta-information that associates said 
respective subject matter locations to corresponding ones of said text 
labels, said label manager also controlling a label search mode for 
utilizing said text labels to locate specific corresponding ones of said 
respective subject matter locations from said audio/video data, said
label manager providing a label-search user interface upon a display 
of said imaging device for displaying said text labels and 
corresponding visual images of said respective subject matter 
locations from said audio/video data, a system user interactively
choosing a selected text label by utilizing said label-search user 
interface, said imaging device responsively displaying said 
audio/video data from a selected subject matter location
corresponding only to said selected text label. 

44. (Currently Amended) A system for cataloguing electronic information, 
comprising: 

an electronic device that captures said electronic information that includes 
verbal narration data concurrently provided specifically to identify 
where respective subject matter locations are positioned in said 
audio/video data;

a speech recognition engine that analyzes said electronic information to 
generate labels that correspond to said respective subject matter 
locations in said electronic information, said labels being text 
conversions of utterances in said verbal narration data, said labels 
being created for locating each being specifically aligned with 
corresponding ones of said respective subject matter locations within 
said audio/video data; and

a label manager that utilizes said labels to locate said respective subject 
matter locations in said electronic information.

45. (Currently Amended) A system for cataloguing electronic information,
comprising: 

an electronic device that captures audio/video data corresponding to a 
photographic target, said audio/video data including a narration
concurrently provided by a narrator to specifically identify where
respective subject matter locations are positioned in said
audio/video data; and

a speech recognition engine that automatically performs a speech
recognition process upon said audio/video data to generate labels
that correspond to said respective subject matter locations in said
audio/video data, said labels being text conversions of utterances in
said narration, said labels being created for locating each being
specifically aligned with corresponding ones of said respective
subject matter locations within said audio/video data.

46. (Currently Amended) A system for cataloguing electronic information, 
comprising: 

an electronic device that captures audio/video data corresponding to a
photographic target, said audio/video data including a narration
concurrently provided by a narrator specifically to identify where
respective subject matter locations are positioned in said
audio/video data; and

a label manager that controls a label search mode for utilizing labels 
derived from said narration to locate corresponding ones of said 
respective subject matter locations in said audio/video data, said
labels being created for locating each being specifically aligned with
corresponding ones of said respective subject matter locations within
said audio/video data.

47. (Currently Amended) An electronic cataloguing system implemented by:

capturing electronic data which includes a narration concurrently provided
by a narrator specifically to identify where respective subject matter
locations are positioned in said audio/video data;

performing a speech recognition process upon said electronic data to
automatically generate labels that correspond to said respective 
subject matter locations in said electronic data, said labels being text 
conversions of utterances in said narration, said labels being created 
for locating each being specifically aligned with corresponding ones 
of said respective subject matter locations within said audio/video
data; and 

utilizing said labels to locate said respective subject matter locations in 
said electronic data. 

48. (Previously Presented) The system of claim 8 wherein said meta- 
information includes video timecode information. 

49. (Previously Presented) The system of claim 12 wherein said confidence 
measures include a label amplitude parameter and a label duration parameter. 

50. (Previously Presented) The system of claim 17 wherein said representative 
images are implemented as thumbnail images. 

51. (Previously Presented) The system of claim 1 wherein said electronic device 
is a single discrete video camcorder that hosts said speech recognition engine, 
said label manager, said labels, and said audio/video data.